Toxic Code Snippets on Stack Overflow
Authors
Abstract
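Online code clones are code fragments that are copied from software projects or online sources to Stack Overflow as examples. Due to an absence of a checking mechanism after the code has been copied to Stack Overflow, they can become toxic code snippets, e.g., they suffer from being outdated or violating the original software license. We present a study of online code clones on Stack Overflow and their toxicity by incorporating two developer surveys and a large-scale code clone detection. A survey of 201 high-reputation Stack Overflow answerers (33 percent response rate) showed that 131 participants (65 percent) have ever been notified of outdated code and 26 of them (20 percent) rarely or never fix the code. 138 answerers (69 percent) never check for licensing conflicts between their copied code snippets and Stack Overflow's CC BY-SA 3.0. A survey of 87 Stack Overflow visitors shows that they experienced several issues from Stack Overflow answers: mismatched solutions, outdated solutions, incorrect solutions, and buggy code. 85 percent of them are not aware of the CC BY-SA 3.0 license enforced by Stack Overflow, and 66 percent never check for license conflicts when reusing code snippets. Our clone detection found online clone pairs between 72,365 Java code snippets on Stack Overflow and 111 open source projects in the curated Qualitas corpus. We analysed 2,289 non-trivial online clone candidates. Our investigation revealed strong evidence that 153 clones have been copied from a Qualitas project to Stack Overflow. We found 100 of them (66 percent) to be outdated, of which 10 were buggy and harmful for reuse. Furthermore, we found 214 code snippets that could potentially violate the license of their original software and appear 7,112 times in 2,427 GitHub projects.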
Similar resources
Usage and Attribution of Stack Overflow Code Snippets in GitHub Projects
Stack Overflow (SO) is the largest Q&A website for software developers, providing a huge amount of copyable code snippets. Using those snippets raises maintenance and legal issues. SO’s license (CC BY-SA 3.0) requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there is a heated debate on SO’s license model for c...
StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow
Stack Overflow (SO) has been a great source of natural language questions and their code solutions (i.e., question-code pairs), which are critical for many tasks including code retrieval and annotation. In most existing research, question-code pairs were collected heuristically and tend to have low quality. In this paper, we investigate a new problem of systematically mining question-code pairs...
Stack Overflow Query Outcome Prediction
Stack Overflow’s core mission is to create an online encyclopedia for all programming knowledge. In order to ensure quality content in the face of rapid growth, community moderators frequently close low quality questions, often asked by newcomers. In order to alleviate moderator burden and ease newcomers’ transition, we devise two classifiers to predict 1) whether a question will be closed and ...
Interactive Synthesis of Code Snippets
We describe a tool that applies theorem proving technology to synthesize code fragments that use given library functions. To determine candidate code fragments, our approach takes into account polymorphic type constraints as well as test cases. Our tool interactively displays a ranked list of suggested code fragments that are appropriate for the current program point. We have found our system t...
Stack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security
Online programming discussion platforms such as Stack Overflow serve as a rich source of information for software developers. Available information include vibrant discussions and oftentimes ready-to-use code snippets. Previous research identified Stack Overflow as one of the most important information sources developers rely on. Anecdotes report that software developers copy and paste code sni...
Journal
Journal title: IEEE Transactions on Software Engineering
Year: 2021
ISSN: 0098-5589, 1939-3520, 2326-3881
DOI: https://doi.org/10.1109/tse.2019.2900307